1.
biorxiv; 2023.
Preprint in English | bioRxiv | ID: ppzbmed-10.1101.2023.05.26.542489

ABSTRACT

With the rapid spread and evolution of SARS-CoV-2, the ability to monitor its transmission and distinguish among viral lineages is critical for pandemic response efforts. The most commonly used software for the lineage assignment of newly isolated SARS-CoV-2 genomes is pangolin, which offers two methods of assignment, pangoLEARN and pUShER. PangoLEARN rapidly assigns lineages using a machine learning algorithm, while pUShER performs a phylogenetic placement to identify the lineage corresponding to a newly sequenced genome. In a preliminary study, we observed that pangoLEARN (decision tree model), while substantially faster than pUShER, offered less consistency across different versions of pangolin v3. Here, we expand upon this analysis to include v3 and v4 of pangolin, which moved the default algorithm for lineage assignment from pangoLEARN in v3 to pUShER in v4, and perform a thorough analysis confirming that pUShER is not only more stable across versions but also more accurate. Our findings suggest that future lineage assignment algorithms for various pathogens should consider the value of phylogenetic placement.
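The stability comparison described in this abstract can be illustrated with a minimal sketch. It assumes that lineage calls from each software version are already loaded into a dictionary mapping sample ID to lineage name. The function name `stable_fraction`, the version labels, and the toy data are hypothetical; none of this is pangolin's API or the preprint's analysis code.

```python
from typing import Dict


def stable_fraction(calls_by_version: Dict[str, Dict[str, str]]) -> float:
    """Fraction of shared samples given the same lineage by every version."""
    versions = list(calls_by_version.values())
    if not versions:
        raise ValueError("need at least one version of lineage calls")

    # Only compare samples that appear in every version's output.
    shared = set(versions[0])
    for calls in versions[1:]:
        shared &= set(calls)
    if not shared:
        return 0.0

    # A sample is stable when all versions agree on a single lineage name.
    stable = sum(
        1 for sample in shared
        if len({calls[sample] for calls in versions}) == 1
    )
    return stable / len(shared)


# Hypothetical toy data (not results from the preprint).
toy_calls = {
    "version_a": {"s1": "B.1.1.7", "s2": "B.1", "s3": "AY.4"},
    "version_b": {"s1": "B.1.1.7", "s2": "B.1.2", "s3": "AY.4"},
}
print(stable_fraction(toy_calls))
```

In this toy input, two of the three samples keep the same lineage across both versions. A real comparison would also need to account for lineages that were renamed or newly designated between releases, which this sketch ignores.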

2.
biorxiv; 2023.
Preprint in English | bioRxiv | ID: ppzbmed-10.1101.2023.02.03.527052

ABSTRACT

Pathogen nomenclature systems are a key component of effective communication and collaboration for researchers and public health workers. Since February 2021, the Pango nomenclature for SARS-CoV-2 has been sustained by crowdsourced lineage proposals as new isolates were added to a growing global dataset. This approach to dynamic lineage designation depends on a large and active epidemiological community identifying and curating each new lineage. The process is vulnerable to time-critical delays as well as regional and personal bias. To address these issues, we developed a simple heuristic approach that divides a phylogenetic tree into lineages based on shared ancestral genotypes. We additionally provide a framework, extensible to any pathogen, that automatically prioritizes the lineages by growth rate and association with key mutations or locations. Our implementation is efficient on extremely large phylogenetic trees and produces results similar to existing Pango lineage designations when applied to SARS-CoV-2. This method offers a simple, automated and consistent approach to pathogen nomenclature that can assist researchers in developing and maintaining phylogeny-based classifications in the face of ever-increasing genomic datasets.
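The abstract does not spell out its heuristic, so the following is only a simplified stand-in showing what dividing a tree into lineages by shared ancestry can look like. It assumes each node records the number of mutations on its incoming branch, and it starts a new lineage at any internal node that has enough descendant leaves and enough branch mutations. The `Node` class, the thresholds `min_leaves` and `min_mutations`, and the toy tree are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    name: str
    mutations: int = 0  # mutations on the branch leading to this node
    children: List["Node"] = field(default_factory=list)


def count_leaves(node: Node) -> int:
    """Number of leaves (samples) descending from this node."""
    if not node.children:
        return 1
    return sum(count_leaves(child) for child in node.children)


def assign_lineages(root: Node, min_leaves: int = 2,
                    min_mutations: int = 1) -> Dict[str, str]:
    """Label every leaf with its nearest ancestor that qualifies as a lineage root."""
    labels: Dict[str, str] = {}

    def walk(node: Node, current: str) -> None:
        is_internal = bool(node.children)
        # Assumed rule: an internal node founds a lineage if it is well
        # supported by both descendant count and branch mutations.
        if (is_internal and node is not root
                and count_leaves(node) >= min_leaves
                and node.mutations >= min_mutations):
            current = node.name
        if not is_internal:
            labels[node.name] = current
        for child in node.children:
            walk(child, current)

    walk(root, root.name)
    return labels


# Hypothetical toy tree: "A.1" has 2 leaves and 2 branch mutations; "n1" has none.
toy_tree = Node("A", 0, [
    Node("s1"),
    Node("A.1", 2, [Node("s2", 1), Node("s3")]),
    Node("n1", 0, [Node("s4"), Node("s5", 1)]),
])
print(assign_lineages(toy_tree))
```

Here `s2` and `s3` fall into the new lineage `A.1`, while the remaining samples stay in `A` because `n1` carries no mutations on its branch. Recomputing leaf counts at every node keeps the sketch short but would be too slow for very large trees, where counts would be cached in a single pass.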

3.
medrxiv; 2022.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2022.01.07.22268918

ABSTRACT

The unprecedented SARS-CoV-2 global sequencing effort has suffered from an analytical bottleneck. Many existing methods for phylogenetic analysis are designed for sparse, static datasets and are too computationally expensive to apply to densely sampled, rapidly expanding datasets when results are needed immediately to inform public health action. For example, public health is often concerned with identifying clusters of closely related samples, but the sheer scale of the data prevents manual inspection and the current computational models are often too expensive in time and resources. Even when results are available, intuitive data exploration tools are of critical importance to effective public health interpretation and action. To help address this need, we present a phylogenetic summary statistic which quickly and efficiently identifies newly introduced strains in a region, the resulting clusters of infected individuals, and their putative geographic origins. We show that this approach performs well on simulated data and is congruent with a more sophisticated analysis performed during the pandemic. We also introduce Cluster Tracker (https://clustertracker.gi.ucsc.edu/), a novel interactive web-based tool to facilitate effective and intuitive SARS-CoV-2 geographic data exploration and visualization. Cluster Tracker is updated daily and automatically identifies and highlights groups of closely related SARS-CoV-2 infections resulting from inter-regional transmission across the United States, streamlining public health tracking of local viral diversity and emerging infection clusters. The combination of these open-source tools will empower detailed investigations of the geographic origins and spread of SARS-CoV-2 and other densely sampled pathogens.
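The summary statistic itself is not defined in this abstract. As a rough illustration of the underlying idea of finding regional clusters on a tree, the sketch below treats every maximal clade whose samples all come from one region as a single putative introduction. This all-or-nothing rule is an assumed simplification, not the preprint's statistic, and the nested-tuple tree encoding, the function name `regional_clusters`, and the toy data are hypothetical.

```python
from typing import Dict, List, Optional


def regional_clusters(tree, region_of: Dict[str, str],
                      region: str) -> List[List[str]]:
    """Return maximal clades whose leaves are all sampled in `region`.

    A leaf is a sample name (str); an internal node is a tuple of children.
    """
    clusters: List[List[str]] = []

    def visit(node) -> Optional[List[str]]:
        # Returns the leaf list if the whole subtree is in-region, else None.
        if isinstance(node, str):
            return [node] if region_of[node] == region else None
        results = [visit(child) for child in node]
        if all(r is not None for r in results):
            return [leaf for r in results for leaf in r]
        # Mixed subtree: each fully in-region child is a maximal cluster.
        for r in results:
            if r is not None:
                clusters.append(r)
        return None

    top = visit(tree)
    if top is not None:
        clusters.append(top)
    return clusters


# Hypothetical toy tree and sampling locations.
toy_tree = (("a", "b"), ("c", ("d", "e")))
toy_regions = {"a": "CA", "b": "CA", "c": "NY", "d": "CA", "e": "CA"}
print(regional_clusters(toy_tree, toy_regions, "CA"))
```

The toy tree yields two separate in-region clusters, `a, b` and `d, e`, because the out-of-region sample `c` sits between them. A realistic method would tolerate occasional out-of-region samples inside a cluster and would weigh branch lengths, which this sketch does not attempt.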


Subject(s)
COVID-19